Abstract: Symbolic execution is an effective path-oriented
and constraint-based program analysis technique. Recent years
have seen significant progress in the research and application
of symbolic execution. However, symbolic execution still suffers
from scalability problems in practice, especially when applied
to large-scale or very complex programs. In this paper, we
propose a new style of symbolic execution, named Speculative
Symbolic Execution (SSE), which speeds up symbolic execution by
reducing the number of constraint solver invocations. In SSE,
when encountering a branch statement, the search procedure may
speculatively explore the branch without regard to its
feasibility. The constraint solver is invoked only when the
number of speculated branches reaches a specified threshold.
In addition, we present a key optimization technique that
enhances SSE greatly. We have implemented SSE and the
optimization technique on Symbolic PathFinder (SPF).
Experimental results on six programs show that our method
reduces the number of constraint solver invocations by 21% to
49% (with an average of 30%) and the search time by 23.6% to
43.6% (with an average of 30%).

Keywords: symbolic execution; speculative symbolic execution

I. Introduction

Symbolic execution (SE) is a basic program analysis 
technique that was proposed more than thirty years ago [1]. 
Recently, SE has drawn renewed interest from both academia
and industry, partly due to the impressive progress in
constraint solving, related algorithms and computation power
[2] [3] [4]. Instead of executing programs with concrete inputs,
symbolic execution feeds programs with symbolic
ones, meaning that a symbolic input could initially take 
any value of the specific type. Assignment statements are 
interpreted as the manipulations of symbolic expressions. 
When encountering a branch statement, the process forks 
and both of the branches are taken. On each path, the process 
maintains a set of constraints called path condition which 
must hold along that path. For each branch, the path condi- 
tion is updated according to the corresponding condition and 
submitted to a constraint solver to check the satisfiability. In 
the context of test generation, when a path ends or a bug is 
found, the path condition can be solved to get a test case. 
For deterministic programs, the same execution path or the 
same bug can be replayed by feeding such test case as input. 
Basically, symbolic execution attempts to achieve automatic
code comprehension by walking through the path space of
a program. Provided that all the path conditions can be
solved successfully, symbolic execution can cover all the
behaviors of the program.

In the past years, symbolic execution has shown great
promise in automated test generation, proving program
properties, bug detection and so on [3]. However, in practice,
scalability is still one of the main obstacles to applying
symbolic execution to large-scale programs. This issue mainly
stems from two closely related causes: the path explosion
phenomenon and constraint solving overhead. The number of paths
grows exponentially with the number of conditions in the
program, making exploration of the whole path space infeasible
for large-scale programs. Constraint solving dominates the
running time of SE. When exploring deep paths, the path
condition may become very complex, or even unsolvable. In
addition, constraint solving overhead is almost always
aggravated by the path explosion phenomenon.

To alleviate the constraint solving overhead of SE, many
techniques have been proposed. Many symbolic execution systems
employ query optimization techniques to reduce both the
complexity and the number of queries. For example,
counterexample caching stores unsatisfiable path conditions as
counterexamples so that previous solving results can be reused
[5]. Constraint independence splits a constraint set into
independent subsets, so that only the constraints relevant to a
query are sent to the solver and the cache hit rate increases
[5] [6] [7] [8]. Concretization reduces complex constraints
(such as nonlinear constraints [9]) to simpler ones, and is
heavily used in concolic execution [7][8][10].

Although these techniques greatly improve the performance of
symbolic execution in practice, constraint solving still
dominates its running time. According to the experiments of
KLEE [5], 40% to 90% of the whole running time is spent on
constraint solving. In the experiments of Cloud9 [11],
constraint solving consumes more than half of the total
execution time. In some experiments with S2E [12], almost all
the running time is spent on constraint solving.

This paper proposes a new style of symbolic execution,
named Speculative Symbolic Execution (SSE), which speeds
up symbolic execution by reducing the number of constraint
solver invocations, and hence improves the scalability
of symbolic execution. Pure symbolic execution invokes the
constraint solver immediately whenever a path condition is
updated. In SSE, when a branch instruction is encountered, the
path condition is updated accordingly, but the constraint
solver is not necessarily invoked. The search procedure may
advance along the path without determining its feasibility
until the number of unsolved path conditions reaches a
specified threshold. If the currently visited path is feasible,
the procedure continues; otherwise, it backtracks.

Intuitively, SSE optimistically treats branches as feasible.
Path conditions are submitted to the constraint solver
in batches, not one by one as in pure symbolic execution.
When speculation succeeds, multiple invocations of the
constraint solver are replaced by a single invocation. When
speculation fails, a backtracking mechanism finds the
first bad branch that made the speculation fail. Basically,
the more feasible branches in the path space, the better SSE
performs.

In this paper, we give the details of the SSE algorithm and
discuss its effectiveness. We also propose an optimization
technique, named Absurdity Based Optimization, which is
simple but very effective in practice. For programs with
a high ratio of infeasible branches in the path space, this
optimization can significantly reduce the number of constraint
solver invocations. To some extent, our optimization is
complementary to SSE, and can also be applied to pure
symbolic execution.

The contribution of this paper is three-fold. 

Firstly, we propose speculative symbolic execution, a new
style of symbolic execution, which improves the scalability
of classical symbolic execution by reducing the number of
constraint solver invocations. We also propose the absurdity
based optimization technique to improve the reduction further.

Secondly, we have implemented SSE and the optimization 
on top of Symbolic Pathfinder [13] to extend the scalability 
of this symbolic execution system. 

Finally, to evaluate the effectiveness of our method, we
have conducted several experiments and found a new
characteristic of the path spaces of programs. The experimental
results show that our approach reduces the search time by
23.9% to 43.6% (with an average of 30%). Based on these
results, we also investigate how to make our approach work
best when applied to real-world programs.

The remainder of this paper is organized as follows.
Section II introduces the background and shows the basic
idea of SSE through motivating examples. Section III elaborates
the algorithm of SSE and the absurdity based optimization
technique. Section IV presents our implementation on SPF
and reports the experimental results. Finally, Sections V and
VI discuss the related work and conclude.



II. Overview 

In this section, we describe how SSE works and why SSE
is better than pure SE through motivating examples.

A. Background: Symbolic Execution 

Essentially, symbolic execution feeds programs with sym- 
bolic values as inputs and outputs the result as functions 
of symbolic values. A search procedure is employed to 
systematically traverse the path space of a program by 
maintaining symbolic program states. A symbolic state 
includes the symbolic values of program variables, a path 
condition and a program counter [1]. The path condition is 
a boolean formula that contains the constraints which the 
inputs should satisfy if they drive the program along the 
current path. Operations on variables are interpreted as
manipulations of symbolic expressions. When encountering
a branch instruction, both branches are taken. For
each branch, the corresponding condition is added to the
path condition and a constraint solver is invoked to check
the satisfiability of the new path condition. The process
advances along feasible branches until the end of the path
is reached. Finally, the generated symbolic states form a
symbolic execution tree.

Take the program in Figure 1 as an example. It computes
the sum of the absolute values of two integers and outputs
the result if the sum is greater than 2. Initially, the inputs
are represented as two symbols, X and Y, and the path
condition is (true). The execution path forks at the branch
statement if (x<0). The constraints $(X \ge 0)$
and $(X < 0)$ are added to the path conditions of the two
paths respectively. A constraint solver is invoked to check
the feasibility of these two paths, both of which are
feasible here. Figure 2 shows the final execution tree, in which
symbolic states are represented as nodes.
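
As an illustration of the symbolic state described above, the following minimal Python sketch models a state as a triple of symbolic values, path condition and program counter, and shows how the branch at line 1 of Figure 1 forks it. The names SymbolicState and fork, and the use of plain strings for symbolic expressions, are assumptions made for this illustration; they are not taken from any symbolic execution system.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SymbolicState:
    values: tuple          # pairs (program variable, symbolic expression)
    path_condition: tuple  # conjunction of constraints, kept as strings
    pc: int                # program counter: next line to execute

def fork(state, condition, then_pc, else_pc):
    """Take both sides of a branch, extending the path condition on each side."""
    then_state = SymbolicState(
        state.values, state.path_condition + (condition,), then_pc)
    else_state = SymbolicState(
        state.values, state.path_condition + (f"not ({condition})",), else_pc)
    return then_state, else_state

# Initial state for Figure 1: inputs are the symbols X and Y, path condition is empty.
initial = SymbolicState(values=(("x", "X"), ("y", "Y")), path_condition=(), pc=1)
then_state, else_state = fork(initial, "X < 0", then_pc=2, else_pc=3)
print(then_state.path_condition, else_state.path_condition)
```

In a real system each of the two path conditions would then be handed to a constraint solver; the sketch stops before that step.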

In this paper, we focus on how the constraint solver is in- 
voked during symbolic execution. We choose the commonly 
used depth first search (DFS) in our illustration. 

Figure 3(a) shows the path space of the example program
with the same layout as Figure 2. The left side of a node
corresponds to the false side of the branch statement. The
number n marked on a branch means that the feasibility of
the branch is determined in the n-th invocation of the solver.
In total, 14 invocations of the constraint solver are needed.
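
The count of 14 can be reproduced with a small sketch. It is not the implementation used in this paper: the constraint solver is replaced by a brute-force search over a small integer domain (an assumption that is adequate only for this toy program), and the names DOMAIN, decisions_of, make_solver and pure_dfs are hypothetical.

```python
from itertools import product

DOMAIN = range(-3, 4)  # brute-force search space; stands in for a real solver
DEPTH = 3              # every path of the example contains three branches

def decisions_of(X, Y, variant=False):
    """Run the Figure 1 program concretely; return the side taken at each branch."""
    x, y = X, Y
    c1 = x < 0
    if c1:
        x = -x
    c2 = y < 0
    if c2:
        y = -y
    x = x + y
    c3 = (x > y) if variant else (x > 2)  # variant: line 7 replaces line 6
    return (c1, c2, c3)

def make_solver(variant):
    calls = {"sat": 0, "unsat": 0}
    def solve(prefix):
        # A path prefix is feasible if some concrete input follows it.
        ok = any(decisions_of(X, Y, variant)[:len(prefix)] == prefix
                 for X, Y in product(DOMAIN, DOMAIN))
        calls["sat" if ok else "unsat"] += 1
        return ok
    return solve, calls

def pure_dfs(variant=False, order=(False, True)):
    """Pure SE by DFS: one solver call per branch, false side first by default."""
    solve, calls = make_solver(variant)
    def visit(path):
        if len(path) == DEPTH:
            return
        for side in order:
            child = path + (side,)
            if solve(child):
                visit(child)
    visit(())
    return calls

print(pure_dfs())  # 14 solver calls in total, all of them sat
```

With variant=True, which models the modified program discussed in Section II.B, the sketch still issues 14 queries, two of which are unsat.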

    int x, y;
1:  if (x < 0)
2:      x = -x;
3:  if (y < 0)
4:      y = -y;
5:  x = x + y;
6:  if (x > 2)
7:  // if (x > y)
8:      output(x);

Figure 1. An Example Program and Its Execution Tree 
Figure 2. An Example Program and Its Execution Tree



B. Motivating Examples of SSE 

When encountering a branch statement, SSE may advance
along the two branches without checking their feasibility.
The constraint solver is invoked only when the number of
unchecked branches reaches a specified number, called the max
speculation depth. If the constraint solver gives a positive
result, the speculation succeeds. Otherwise, we
need to backtrack to the last feasible branch. We now show
how speculation reduces the number of solver invocations in a
DFS manner, using the example in Figure 1.

The initial symbolic state of the program under SSE is the
same as that under SE. Assuming that the max speculation
depth is set to 3, for the branch statement if (x<0), the
procedure advances along the else side without checking
feasibility. Branches of the statement if (y<0) are handled
similarly. When the procedure speculatively takes the else
branch of the statement if (x>2), the max speculation
depth is reached, and therefore the constraint solver is
invoked. Since the path segment from the root to state A (in
Figure 2) is executed speculatively, we call this segment a
speculation segment. As a result, a single invocation of the
constraint solver is enough to determine the feasibility of the
three branches on path #1. As shown in Figure 3(b), the number
n marked on a branch indicates that the feasibility of the
branch becomes known in the n-th solving. The constraint solver
is invoked only at the branches marked with bracketed numbers.
In all, only 8 queries are needed, saving nearly half of those
in pure SE.

Figure 3. Constraint Solving in DFS

Figure 4. Constraint Solving in SSE by DFS With Backtracking

Now consider commenting out line 6 and uncommenting line
7 in the example program in Figure 1. Paths #5 and #7 then
become infeasible. In Figure 4, they are marked with a cross.
In this case, the number of solver invocations under pure
SE is still 14. In SSE, as shown in Figure 4(a), the result
of the 5th solving, with path condition
$(X < 0 \land Y \ge 0 \land -X + Y \le Y)$, is unsat.
Subsequently, the backtracking mechanism analyzes the current
speculation segment (i.e., from the root to point A) by binary
search to find the first infeasible branch, which costs two
extra solver invocations (the 6th and 7th). Then the procedure
backtracks to point A and continues on path #6. Constraint
solving on path #7 is similar to that on path #5. Finally, 11
solver invocations are performed, with 9 sat and 2 unsat
results, saving 3 out of the 14 in pure SE.

It is worth noting that the result of SSE depends on
the order in which the path space is explored. Consider
exploring the path space from right to left, i.e., exploring
the true side of a branch statement first. As shown in Figure
4(b), only 8 solver invocations are needed.
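
The three counts discussed in this subsection (8 queries for the original program, 11 and 8 for the modified program under the two exploration orders) can be reproduced with the following sketch of speculative DFS. It rests on several assumptions: the solver is a brute-force search over a small integer domain, the search is written recursively rather than with an explicit state stack, and the binary search probes prefixes with a plain lo <= hi loop, which is one way to obtain the two extra queries mentioned above. All function names are hypothetical.

```python
from itertools import product

DOMAIN = range(-3, 4)  # brute-force search space; stands in for a real solver
DEPTH = 3              # every path of the example contains three branches

def decisions_of(X, Y, variant):
    """Run the Figure 1 program concretely; return the side taken at each branch."""
    x, y = X, Y
    c1 = x < 0
    if c1:
        x = -x
    c2 = y < 0
    if c2:
        y = -y
    x = x + y
    c3 = (x > y) if variant else (x > 2)  # variant: line 7 replaces line 6
    return (c1, c2, c3)

def make_solver(variant):
    calls = {"sat": 0, "unsat": 0}
    def solve(prefix):
        ok = any(decisions_of(X, Y, variant)[:len(prefix)] == prefix
                 for X, Y in product(DOMAIN, DOMAIN))
        calls["sat" if ok else "unsat"] += 1
        return ok
    return solve, calls

def speculative_dfs(variant, k, order=(False, True)):
    """Speculative DFS with max speculation depth k; returns solver call counts."""
    solve, calls = make_solver(variant)

    def first_bad(path, lo, hi):
        # Binary search over prefix lengths lo..hi; prefix hi+1 is known unsat.
        while lo <= hi:
            mid = (lo + hi) // 2
            if solve(path[:mid]):
                lo = mid + 1
            else:
                hi = mid - 1
        return lo  # length of the shortest infeasible prefix

    def visit(path, verified):
        # `verified` is the length of the prefix of `path` known to be feasible.
        # Returns None if `path` is confirmed feasible, otherwise the length of
        # the shortest infeasible prefix (always <= len(path)).
        if len(path) == DEPTH:  # path ends: check if still speculating
            if verified < len(path) and not solve(path):
                return first_bad(path, verified + 1, len(path) - 1)
            return None
        for side in order:
            child = path + (side,)
            if len(child) - verified >= k:  # max speculation depth reached
                if solve(child):
                    bad = visit(child, len(child))  # new speculation segment
                else:
                    bad = first_bad(child, verified + 1, len(child) - 1)
            else:  # speculate: advance without a solver call
                bad = visit(child, verified)
            if bad is not None and bad <= len(path):
                return bad  # this node itself is unreachable: cut it away
            verified = len(path)  # this node is now known to be feasible
        return None

    visit((), 0)
    return calls

print(speculative_dfs(False, 3))                       # original program
print(speculative_dfs(True, 3))                        # modified, false side first
print(speculative_dfs(True, 3, order=(True, False)))   # modified, true side first
```

Setting k to 1 makes every branch trigger a solver call, so the sketch degenerates to pure SE with 14 queries.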

III. Speculative Symbolic Execution

One can imagine using different search styles in SSE. 
In this section, we present the speculative DFS algorithm 
that combines speculation and DFS, and the absurdity based 
optimization. Then we discuss the effectiveness of our 
approach. 

A. Speculative DFS Algorithm 

Figure 5 shows the speculative DFS algorithm, including
the main search procedure and the backtrack procedure.
The algorithm traverses the path space of the program
by DFS and performs speculation with a specially designed
backtracking mechanism. A StateStack is maintained
to store the symbolic states on the current path. Initially,
the initial symbolic state of the program is pushed onto the
StateStack. The while loop expands the top element
of the StateStack until the stack is empty. The procedure
moves forward by repeatedly symbolically executing the next
statement of the top state in the StateStack. For a non-branch
statement, our algorithm behaves identically to pure SE.
When processing a branch statement, if the current speculation
depth has not reached maxSpeculationDepth,
the branch is taken without checking feasibility and a
new state with the updated path condition is pushed onto the
StateStack directly, as shown in line 10. Otherwise, as
shown in line 12, the function checkFeasibility() checks
the satisfiability of the current path condition. If the result
is sat, the current state is pushed onto the StateStack
and a new speculation segment starts. If the result is unsat,
the backtrack() procedure cuts the infeasible branches
away. The procedure also backtracks when reaching the end of a
path. The backtrack() procedure behaves differently depending
on the feasibility of the last speculation segment on the
path. Note that, when maxSpeculationDepth
is set to 1, this algorithm is equivalent to pure SE.

The backtracking procedure behaves differently in different
cases to suit the context of speculation. For a
failed speculation with k branches, the speculation segment
before the last branch (already known to be infeasible) is
analyzed to find the first infeasible branch. We adopt the
binary search strategy for its stable performance in different
cases, which needs at most $\lceil \log_2 (k - 1) \rceil$
invocations of the constraint solver. Line 34 deals with the
other case: when the path ends with a reachable state, the
procedure backtracks to the last unexplored branch.

B. Eliminating False Alarms 

Although SSE generates the same execution tree as pure
SE, in practice, bugs located in dead code may cause SSE
to yield different analysis results from pure SE. Consider
the example shown in Figure 6. Line 5 contains a 'divide-by-
zero' bug; however, it is unreachable since its path condition
$(a = b \land a \ne b)$ is unsatisfiable. In SSE, provided that
the two branch statements in line 2 and line 3 are in the same
speculation segment, line 5 would be executed without a
prior determination of its reachability. Hence a false alarm
will be reported.

 1: search(int maxSpeculationDepth) {
 2:   StateStack = {initial state};
 3:   while (StateStack not empty) {
 4:     s = get next statement;
 5:     if (s is non-branch statement)
 6:       perform as pure symbolic execution;
 7:     else if (s is branch statement) {
 8:       choose one unexplored branch;
 9:       if (not reach maxSpeculationDepth) {
10:         pushState();
11:       } else {
12:         checkFeasibility();
13:         if (feasible) { // speculation succeeds
14:           pushState();
15:           start new speculation segment;
16:         } else { // speculation fails
17:           backtrack();
18:         }
19:       }
20:     }
21:     if (path ends) { // path end
22:       if (in speculation segment)
23:         checkFeasibility();
24:       backtrack();
25:     }
26:   }
27: }
28: backtrack() {
29:   if (speculation fails) {
30:     binarySearchFirstBadBranch();
31:     pop unreachable states;
32:     backtrack to the last feasible branch;
33:   } else {
34:     backtrack to the last unexplored branch;
35:   }
36: }

Figure 5. Speculative DFS Algorithm

1: int a, b;
2: if (a == b) {
3:   if (a != b) {
4:     // unreachable bug
5:     a = a/0;
6:   }
7: }

Figure 6. Another Example Program 

Technically, this issue can be addressed simply by checking
the reachability of the potential bug point just before
generating the bug report. It is necessary to point out that
exceptions raised by constraint solving (for example, by
constraints beyond the ability of the constraint solver)
should be handled carefully, because a repeated reachability
check would trigger the same exception again. In this
situation, the speculation segment should be checked carefully
to find the first solvable and feasible branch, if any.
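
The reachability check before reporting can be sketched as follows for the program in Figure 6. This is only an illustration under stated assumptions: the solver is a brute-force search over a tiny domain, constraints are Python lambdas over (a, b), and the names satisfiable and report_bug are hypothetical.

```python
from itertools import product

DOMAIN = range(-2, 3)  # tiny search space standing in for a constraint solver

def satisfiable(path_condition):
    """Brute-force check: does some (a, b) satisfy every constraint?"""
    return any(all(c(a, b) for c in path_condition)
               for a, b in product(DOMAIN, DOMAIN))

def report_bug(path_condition, message):
    """Return the report only if the bug point is reachable; otherwise suppress it."""
    return message if satisfiable(path_condition) else None

# Path condition accumulated speculatively when line 5 of Figure 6 is reached.
speculative_pc = [lambda a, b: a == b, lambda a, b: a != b]
print(report_bug(speculative_pc, "divide-by-zero at line 5"))
```

Because the accumulated path condition is unsatisfiable, the report is suppressed and the false alarm never reaches the user.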

C. Correctness 

We define the correctness of SSE as follows: "Speculative
symbolic execution generates the same execution tree as pure
symbolic execution in the end."

Here we only give an informal description of the correct- 
ness of the speculative DFS algorithm. 

With respect to the states of the generated execution tree,
speculative DFS differs from DFS under pure SE only in
that it may touch states that are unreachable in pure SE.
Therefore, to show the correctness of speculative DFS, it
suffices to prove the following two points: all the states with
an unsatisfiable path condition touched by speculative DFS will
eventually be cut away from the execution tree, and all the cut
states have an unsatisfiable path condition.

On the one hand, suppose that state s is unreachable
in pure SE but touched in speculative DFS. There exists
a time at which state s is the currently visited state (i.e.,
at the top of StateStack). Let $s_1, \ldots, s_k$
($k \ge 1 \land s_k = s$) be the corresponding speculation
segment ending with state s. The following cases need to be
considered in the while loop.

• Case 1: State s is the end of a path. Line 23 in Figure
5 checks the feasibility of the current state and gets a
negative result. Then, in the backtracking procedure, line
30 analyzes the speculation segment and lines 31 and 32
backtrack to the last feasible branch. All the states in
$s_1, \ldots, s_k$ with an unsatisfiable path condition are cut
off from the execution tree.

• Case 2: The next statement of state s is not a branch
statement, and

• Case 3: The next statement of state s is a branch
statement and the max speculation depth has not
been reached. In both cases, the speculation segment
$s_1, \ldots, s_k$ will be expanded by the while loop in
Figure 5 without determining the feasibility until a path end
or the max speculation depth is reached. Suppose the expanded
speculation segment is $s_1, \ldots, s_k, \ldots, s_m$. If
$s_m$ is a path end, the argument is similar to case 1.
Otherwise, $s_m$ reaches the max speculation depth. Line 12
checks the path condition of $s_m$. Since $s_k$ is
unreachable, $s_m$ must also be unreachable. Then line 17
invokes the backtracking procedure and subsequently line 30
finds the unreachable states in
$s_1, \ldots, s_k, \ldots, s_m$, which are in turn cut off in
line 31.

Thereby, we claim that all the states with unsatisfiable 
path condition touched by speculative DFS will be finally 
cut away from the execution tree. 

On the other hand, the only place in our algorithm where 
states are cut off from the execution tree is line 31. Before 
that, line 30 has distinguished the reachable states from the 
unreachable ones, so we claim that all the cut states have 
unsatisfiable path condition. 

D. Feasibility 

The only possible risk to the feasibility of SSE is that
SSE executes dead code that is never executed in pure SE. Our
backtracking mechanism guarantees that all the infeasible
states will be cut away from the execution tree. However, in
practice, some speculatively executed dead code may affect
components of the system that cannot be backtracked (such as
updating a database). In such cases, when the program
behaviors are impacted by these components, SSE may get a
different result from pure SE. In fact, for this kind of
program, pure SE may not work either.

This issue can be addressed by blocking the influence of 
speculatively executed instructions. One typical technique is 
providing appropriate support for symbolic execution (such 
as environment modeling [5]) to make the system more 
backtrackable. 

E. Absurdity Based Optimization 

SSE treats an unexplored branch as feasible at first
glance and backtracks when a speculation fails.
This feature implies that the more feasible branches in the
execution tree, the better SSE performs. Meanwhile, it
also implies that SSE is not good at handling
programs with a high ratio of infeasible branches, since too
many backtrackings might negate the benefits brought by
successful speculation. To address this problem, we propose
a simple but effective optimization, absurdity based
optimization, which is complementary to SSE because it is
effective on programs with a high ratio of infeasible branches.
This optimization is based on the following proposition.

Proposition 1: Regardless of runtime errors, given a 
reachable branch statement, at least one of its branches is 
feasible. 

This proposition comes from the well-known reductio ad
absurdum in first-order logic [14], which says that if
$\Gamma; \psi$ is inconsistent, then $\Gamma \models \neg\psi$,
where $\Gamma$ is a set of well-formed formulae (wff) and
$\psi$ is a wff. In the context of symbolic
execution, for instance, let state s be a reachable state with
a satisfiable path condition
$(\varphi_1 \land \ldots \land \varphi_n)$. Suppose the
next statement is a two-choice branch statement, say if
($\phi$), where $\phi$ is a boolean condition. If the search
procedure has explored the then branch and found that it is
infeasible, i.e., the constraint set
$\{\varphi_1, \ldots, \varphi_n\}$ together with $\phi$ is
inconsistent, then we can deduce that
$\varphi_1, \ldots, \varphi_n \models \neg\phi$. Since the path
condition itself is satisfiable, the path condition extended
with $\neg\phi$ is satisfiable as well. Therefore, without
querying the constraint solver, we know that the else
branch is feasible.

This simple optimization is applicable to both pure and
speculative symbolic execution. In practice, most of the
branch instructions used in programs have only two choices.
Therefore, whenever an infeasible branch is explored before
its counterpart, one invocation of the constraint solver can
be saved. A high ratio of infeasible branches in the path
space provides many chances to perform this optimization.
Consider the example in Figure 1 (with line 6 commented out and
line 7 uncommented): with our optimization, the
8th and 11th solver invocations are unnecessary.

Figure 7. Speculative DFS With Absurdity Based Optimization

As shown in Figure 7, branches where constraint solving is 
saved are marked with asterisk. 
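
The effect of the optimization, including its dependence on exploration order, can be illustrated with a short sketch. For brevity the sketch applies the optimization to pure DFS rather than to speculative DFS, on the modified example program (line 7 instead of line 6). The brute-force solver and the names decisions_of and pure_dfs_with_absurdity are assumptions of this illustration.

```python
from itertools import product

DOMAIN = range(-3, 4)  # brute-force search space; stands in for a real solver
DEPTH = 3              # every path of the example contains three branches

def decisions_of(X, Y):
    """Run the modified Figure 1 program (line 7 active); return branch sides taken."""
    x, y = X, Y
    c1 = x < 0
    if c1:
        x = -x
    c2 = y < 0
    if c2:
        y = -y
    x = x + y
    return (c1, c2, x > y)

def pure_dfs_with_absurdity(order):
    """Pure DFS that skips the solver when the sibling branch was found infeasible."""
    queries = 0
    def solve(prefix):
        nonlocal queries
        queries += 1
        return any(decisions_of(X, Y)[:len(prefix)] == prefix
                   for X, Y in product(DOMAIN, DOMAIN))
    def visit(path):  # only called on nodes already known to be reachable
        if len(path) == DEPTH:
            return
        sibling_infeasible = False
        for side in order:
            child = path + (side,)
            if sibling_infeasible:
                feasible = True  # Proposition 1: the other side must be feasible
            else:
                feasible = solve(child)
                sibling_infeasible = not feasible
            if feasible:
                visit(child)
    visit(())
    return queries

print(pure_dfs_with_absurdity((False, True)))  # infeasible false sides come first
print(pure_dfs_with_absurdity((True, False)))  # infeasible sides come second
```

In this sketch, exploring the false side first saves two of the 14 queries, whereas exploring the true side first saves none, because no infeasible branch is seen before its counterpart.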

Absurdity based optimization is also related to the order 
in which the execution tree is explored. If the path space in 
Figure 7 is explored from right to left, since no infeasible 
branch is explored before its counterpart, no information can 
be used to perform optimization. Thereby, we always attempt 
to explore the infeasible side first in practice. 

F. Discussion 

In this subsection, we first explain the benefits and cost of
SSE, then discuss the factors that influence the effectiveness
of SSE, and finally give a theoretical analysis of the
speculative DFS algorithm.

The benefit brought by SSE is the constraint solver
invocations saved when speculation succeeds. A successful
speculation on a speculation segment of length k needs only one
invocation, saving $k - 1$ invocations compared with pure SE.

The cost of our approach lies in failed speculations.
Consider a speculation segment with k branches
$b_1, \ldots, b_i, \ldots, b_k$ ($1 \le i \le k$), where the
branches from $b_i$ onward (including $b_i$) are infeasible,
and the corresponding path conditions are
$p_1, \ldots, p_i, \ldots, p_k$. In SSE, the instructions
between $b_i$ and $b_k$ are executed speculatively, which
consumes extra time and memory. In addition to the first
solving on $p_k$, the binary search over
$p_1, \ldots, p_{k-1}$ to find the backtracking point
needs at most $\lceil \log_2 (k - 1) \rceil$ queries. This may
be more expensive than solving the path conditions
$p_1, \ldots, p_i$ in pure SE when i is small.

The effectiveness of our approach is influenced by the
characteristics of the program under analysis. Specifically,
the following factors matter:

• The ratio of infeasible branches in the path space.
SSE is suitable for programs with a high ratio of
feasible branches. For programs with a high ratio
of infeasible branches, SSE can be improved by the
absurdity based optimization. Generally, this factor is
the most important one.

• The shape of the path space. SSE is also affected by the
shape of the path space. For example, consecutive
branches in the same direction (i.e., all left turning or
all right turning) in the execution tree could increase the
success rate of speculation.

• The exploration order over the path space. As discussed
before, both SSE and the optimization depend on the
exploration order.

• The complexity of path conditions. The more complex the
constraints, the more useful the reduction of constraint
solving achieved by SSE.

• The proportion of the constraint solving time in the total
running time of SE. We only attack the constraint solving
part of SE. Therefore, the proportion of constraint
solving time in the total running time of SE determines how
much overall improvement can be achieved.

The upper bound on the savings of the speculative DFS
algorithm is specified by the following proposition.

Proposition 2: The number of constraint solver invocations in
speculative depth first search is larger than half of that in
pure symbolic execution.

The proof of Proposition 2 is shown in the appendix.
In particular, when the path space is a full binary tree with
height n (the number of branches in the longest path), the
number of solver invocations in pure SE is $2^{n+1} - 2$
(equal to the number of branches in the tree). In speculative
DFS, let k be the max speculation depth; then the number
of solver invocations $T_n$ can be quantified by the following
equation:
The proof of Equation (1) is shown in the appendix. For a
full binary tree, speculative DFS performs best when $n \le k$,
saving nearly half of the solver invocations. When
$n > k$, our algorithm gets better as k increases.
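
The trend on a full binary tree can be explored numerically with the sketch below. It is a simulation under the assumption that every branch is feasible, so no backtracking search is ever triggered; it counts solver invocations by replaying the control flow of speculative DFS and is not derived from Equation (1). The function names are hypothetical.

```python
def pure_solves(n):
    """Pure SE on a full binary tree of height n: one solver call per branch."""
    return 2 ** (n + 1) - 2

def speculative_solves(n, k):
    """Count solver calls of speculative DFS with max speculation depth k,
    assuming every branch of the full binary tree of height n is feasible."""
    calls = 0
    def visit(depth, verified):
        # `verified` is the depth up to which the current path is known feasible.
        nonlocal calls
        if depth == n:              # path ends
            if verified < depth:    # still inside a speculation segment
                calls += 1
            return
        for _ in range(2):          # both sides of the branch
            if depth + 1 - verified >= k:   # max speculation depth reached
                calls += 1
                visit(depth + 1, depth + 1)
            else:                           # speculate without a solver call
                visit(depth + 1, verified)
            verified = depth        # the child was confirmed, so this node is too
    visit(0, 0)
    return calls

for k in (1, 2, 3):
    print(k, speculative_solves(3, k), pure_solves(3))
```

For height 3 the simulated counts are 14, 10 and 8 for k = 1, 2 and 3, against 14 for pure SE, which agrees with the observations above: k = 1 coincides with pure SE, larger k helps, and the count stays above half of the pure SE count.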

Speculative DFS performs worst when the execution tree
consists of only a single path. In this case, although too
many backtrackings hurt the performance, our optimization
technique can help to improve SSE. It is hard to give a
precise analysis of the worst case because of the irregularity
of the path spaces of programs.

In fact, both the best case and the worst case hardly ever
happen in practice. More experimental evaluation is described
in the next section.

IV. Implementation and Experimental Evaluation

A. Implementation 

We have implemented the speculative DFS algorithm
and the absurdity based optimization on top of Symbolic
PathFinder (SPF) [13] with Java PathFinder (JPF) v6.0
[15][16]. JPF is an open source model checker for Java
bytecode. It mainly consists of a Java Virtual Machine
that supports state storing, state matching and backtracking,
as well as an adaptive search engine that systematically
explores program states. SPF is built as an extension of JPF.
SPF implements symbolic semantics for Java bytecode
instructions and uses JPF to systematically explore the
execution tree of the program under analysis. The features of
our implementation are as follows.

• New search strategy. We have implemented the speculative
DFS algorithm as a new search strategy, named
SpeculativeSegmentDFSearch, to explore the
execution tree of a program speculatively. The backtracking
mechanism in Figure 5 is employed in the new
search strategy.

• New choice generator. We have designed a new
class SpecuPCChoiceGenerator, which inherits
from PCChoiceGenerator. The new choice
generator is used to help with backtracking and with
performing the absurdity based optimization.

• New semantics of branch instructions. To support
speculative execution, the semantics of branch instructions
are adapted. Each branch instruction generates
an instance of class SpecuPCChoiceGenerator.
Speculation is performed according to the current speculation
depth as shown in Figure 5.

• Eliminating false alarms. There exist four kinds of
false alarms caused by SSE in SPF: runtime errors in
the analyzed program, property violations, user defined
exceptions and crashes caused by the program under
analysis. We have handled all these issues in our
implementation.

To use SSE in SPF, users need to configure SPF
to use the speculative DFS strategy (using the property
search.class) and specify the max speculation depth
(using the property symbolic.speculative.depth).

B. Experiments 

To evaluate SSE, we have conducted a set of experiments.
Their objective is to investigate the following research
questions.

a. Effectiveness and cost. How effective is SSE, and what
does it cost, compared with pure SE?

b. Speculation depth. How does the value of the max
speculation depth influence the results, and what is the
optimal speculation depth for a real-world program?

c. Exploration order. In speculative DFS, the execution
tree can be explored in two orders, false-side-first and
true-side-first. Which one is better?

1) Experimental Setup: We choose five programs that are
often used in the experiments related to JPF [13][17][18].
WBS, the Wheel Brake System, comes from the automotive
domain [18]. The rest are all Java data structure programs:
red-black tree (TreeMap), binary search tree (BinTree),
binomial heap (BinHeap) and Fibonacci heap (FibHeap)
[17]. In addition, we write a data structure program List,
which implements a doubly linked list with sorted elements.
For data structure programs, we use parameterized testing
[17] [19] to generate random call sequences of a limited
length. The sizes of these programs range from 230 lines
(for BinTree) to 477 lines (for TreeMap). The ratios of the
infeasible branches in the path spaces range from 0% (for
WBS) to 42% (for List). We choose these programs in
our experiments for two reasons. Firstly, these programs are
often used in the experiments related to JPF, so it is
reasonable to use them as the benchmark to evaluate SSE.
Secondly, the effectiveness of SSE is heavily influenced
by the ratio of infeasible branches in the path space of
a program. For our selected programs, the ratios of the
infeasible branches cover different levels. In fact, 42% (for
List) is quite high. Since each reachable branch has at
least one feasible side, this ratio can never be higher than
50% if each branch has only two sides.
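The ratios quoted above can be recomputed from the solver results of the pure SE runs reported in Table I. The following is a minimal sketch, assuming that in pure SE every branch side triggers exactly one solver query, so that the fraction of unsat answers equals the ratio of infeasible branches. The names PURE_SE_RESULTS and infeasible_ratio are illustrative and not part of SPF.

```python
# Solver results under pure SE, copied from the "pure SE" rows of Table I:
# program -> (#sat, #unsat)
PURE_SE_RESULTS = {
    "WBS": (27646, 0),
    "TreeMap": (27005, 17261),
    "BinTree": (22381, 15589),
    "BinHeap": (164116, 23576),
    "FibHeap": (58014, 9142),
    "List": (128076, 94380),
}


def infeasible_ratio(sat: int, unsat: int) -> float:
    """Fraction of queried branch sides that the solver reported as unsat."""
    return unsat / (sat + unsat)


ratios = {
    name: infeasible_ratio(sat, unsat)
    for name, (sat, unsat) in PURE_SE_RESULTS.items()
}

for name, ratio in ratios.items():
    # Every ratio stays at or below 50%, as argued in the text.
    print(f"{name:8s} {ratio:6.1%}")
```

Under this assumption the sketch gives 0% for WBS and about 42% for List, matching the range stated in the setup.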

We conduct different experiments to investigate the afore- 
mentioned research questions. For each program, we per- 
form four kinds of analysis: pure SE with/without opti- 
mization and SSE with/without optimization. In each kind 
of analysis, we vary the value of the max speculation 
depth and the exploration order independently. For each 
program, the max speculation depth is increased from 2 
to the execution depth of the program. In fact, setting the
max speculation depth larger than the execution depth yields
the same analysis results as setting it to the execution
depth, because in such cases every speculation segment
ends at the end of a path. We use Yices [20] as the
constraint solver because of its high performance and good 
usability. All of the experiments are carried out on an Intel 
Core i7 2.80GHz computer with 8 GB of RAM. 

2) Results: 
a. Effectiveness and Cost 

Table I shows part of the experimental results of three
kinds of analysis: pure SE, SSE and SSE with optimization.
The first column shows the name of each program associated
with its corresponding call sequence length if any. We only
list the best case and the worst case of SSE (measured by
the search time) when the max speculation depth varies
from 2 to the maximum value. The corresponding max
speculation depth is shown after the notation 'B.' and 'W.'.
The third and fourth columns show the numbers of different
constraint solving results and the percentage of unsat results
respectively. Columns 5 to 7 show the total search time,
the time spent on constraint solving, and the percentage of
the search time spent on constraint solving (the average of
three runs). The executed instructions are presented in the
last column to show the cost of SSE. All the results shown
in Table I are collected under the true-side-first
exploration order.

SSE (without optimization) performs best for WBS, which
has no infeasible branches in the execution tree. SSE reduces
the times of constraint solving by 49% in the best case and
35% in the worst case. The search time is saved by 43.6%
and 32% respectively. SSE performs worst for the program
List. In the best case, SSE reduces 5% of the times of
constraint solving and 6% of the search time. In the worst
case, SSE brings an extra 7% of the times of constraint solving
and 8% of the search time. The reason is that the high
ratio of infeasible branches (42%) causes too many failed
speculations, which negate the benefit brought by successful
speculations. On average, SSE (without optimization) reduces
the search time by 16.8% in the best case and by 8.8% in
the worst case.

Table I

Experimental Results (specu. dep.=max speculation depth, B.=Best, W.=Worst)

| Program (call seq. length) | Analysis (specu. dep.) | #sat/unsat/all (Savings) | % unsat | Search Time (s) (Savings) | Solving Time (s) (Savings) | Solving Time ratio | Instructions (extra) |
|---|---|---|---|---|---|---|---|
| WBS | pure SE | 27646/0/27646 | 0% | 66.2 | 62.9 | 95% | 1382246 |
| | SSE B.(10) | 14174/0/14174 (49%) | 0% | 37.5 (43%) | 34.3 (45.4%) | 91% | 1382246 (0%) |
| | SSE W.(2) | 17886/0/17886 (35%) | 0% | 44.9 (32%) | 41.8 (33.5%) | 92% | 1382246 (0%) |
| | SSE+Opt. B.(10) | 14174/0/14174 (49%) | 0% | 37.3 (43.6%) | 34.1 (45.7%) | 91% | 1382246 (0%) |
| | SSE+Opt. W.(2) | 17886/0/17886 (35%) | 0% | 45 (32%) | 42 (33.2%) | 93% | 1382246 (0%) |
| TreeMap (5) | pure SE | 27005/17261/44266 | 39% | 80 | 74.7 | 93% | 855119 |
| | SSE B.(2) | 18569/22045/40614 (8%) | 54% | 72.2 (9.7%) | 65.598 (12%) | 91% | 1077553 (26%) |
| | SSE W.(5) | 20096/23772/43868 (1%) | 54% | 79.4 (0.8%) | 71.515 (4%) | 90% | 1548222 (81%) |
| | SSE+Opt. B.(2) | 11527/23561/35088 (21%) | 67% | 61.1 (23.6%) | 54.6 (27%) | 89% | 1159619 (35.6%) |
| | SSE+Opt. W.(5) | 13187/23829/37016 (16%) | 64% | 66.4 (17%) | 59 (21%) | 89% | 1549553 (81.2%) |
| BinTree (5) | pure SE | 22381/15589/37970 | 41% | 76.6 | 72.2 | 94% | 381092 |
| | SSE B.(2) | 15913/19215/35128 (7.5%) | 55% | 70.5 (8.1%) | 65.4 (9.4%) | 92% | 578416 (52%) |
| | SSE W.(6) | 16841/20975/37816 (0.4%) | 55% | 77 (-0.5%) | 70.8 (2%) | 92% | 980918 (157.4%) |
| | SSE+Opt. B.(2) | 9191/20086/29277 (23%) | 69% | 57.7 (25%) | 52.4 (27.4%) | 91% | 677685 (78%) |
| | SSE+Opt. W.(10) | 9860/20998/30858 (19%) | 68% | 61.6 (20%) | 55.7 (22.8%) | 90% | 984040 (158%) |
| BinHeap (6) | pure SE | 164116/23576/187692 | 13% | 410 | 371 | 90% | 21809086 |
| | SSE B.(21) | 114948/38188/153136 (18.4%) | 25% | 335.9 (18.1%) | 292.3 (21%) | 87% | 29950152 (37.3%) |
| | SSE W.(2) | 125178/32932/158110 (15.8%) | 21% | 345.6 (15.7%) | 306.8 (17%) | 89% | 24138598 (10.7%) |
| | SSE+Opt. B.(21) | 96600/38202/134802 (28.2%) | 28% | 300 (26.8%) | 257.5 (30.6%) | 79% | 29950152 (37.3%) |
| | SSE+Opt. W.(2) | 102410/34164/136574 (27.2%) | 25% | 303.9 (25.9%) | 264.9 (28.6%) | 87% | 25766426 (18.1%) |
| FibHeap (6) | pure SE | 58014/9142/67156 | 14% | 148.5 | 133 | 90% | 8098034 |
| | SSE B.(2) | 44498/10898/55396 (18%) | 20% | 125.2 (16%) | 110 (17%) | 88% | 8416826 (3.9%) |
| | SSE W.(8) | 40302/15848/56150 (16%) | 28% | 130 (13%) | 112.6 (15%) | 87% | 10731504 (32.5%) |
| | SSE+Opt. B.(2) | 37694/11906/49600 (26%) | 24% | 113 (23.9%) | 97.5 (26.7%) | 86% | 8859148 (9.4%) |
| | SSE+Opt. W.(10) | 33896/16160/50056 (25%) | 32% | 117 (21.2%) | 100 (24.8%) | 85% | 10731504 (32.5%) |
| List (6) | pure SE | 128076/94380/222456 | 42% | 520.6 | 501.5 | 96% | 2842969 |
| | SSE B.(2) | 104299/108116/212415 (5%) | 51% | 489.3 (6%) | 467.7 (7%) | 96% | 3832245 (34.8%) |
| | SSE W.(7) | 118384/121056/239440 (-7%) | 51% | 561.4 (-8%) | 533.6 (-6%) | 95% | 7311109 (157.2%) |
| | SSE+Opt. B.(2) | 33488/116635/150123 (32.5%) | 78% | 325 (37.6%) | 303 (39.6%) | 93% | 5371823 (89%) |
| | SSE+Opt. W.(20) | 38705/121176/159881 (28.1%) | 76% | 354.1 (32%) | 327.7 (34.7%) | 93% | 7333909 (157.9%) |
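The average savings quoted for SSE without optimization can be recomputed from the search times in Table I. The sketch below assumes that a saving is defined as (pure SE time - SSE time) / pure SE time and that the average is an unweighted mean over the six programs. The names SEARCH_TIME and saving are illustrative.

```python
# Search times in seconds from Table I:
# program -> (pure SE, SSE best case, SSE worst case), all without optimization.
SEARCH_TIME = {
    "WBS": (66.2, 37.5, 44.9),
    "TreeMap": (80.0, 72.2, 79.4),
    "BinTree": (76.6, 70.5, 77.0),
    "BinHeap": (410.0, 335.9, 345.6),
    "FibHeap": (148.5, 125.2, 130.0),
    "List": (520.6, 489.3, 561.4),
}


def saving(baseline: float, measured: float) -> float:
    """Relative saving in percent; negative values mean a slowdown."""
    return (baseline - measured) / baseline * 100.0


best = {p: saving(se, b) for p, (se, b, _) in SEARCH_TIME.items()}
worst = {p: saving(se, w) for p, (se, _, w) in SEARCH_TIME.items()}

avg_best = sum(best.values()) / len(best)
avg_worst = sum(worst.values()) / len(worst)

print(f"average best-case saving:  {avg_best:.1f}%")
print(f"average worst-case saving: {avg_worst:.1f}%")
```

With these definitions the unweighted means round to the 16.8% and 8.8% stated above, and the worst-case value for List is negative, which corresponds to the reported slowdown.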

SSE with optimization outperforms SE and SSE for all
programs. The optimization brings the most benefits for
List, making SSE reduce 32.5% of the times of constraint
solving and 37.6% of the search time in the best case. This
is because the high ratio of unsat branches provides a lot
of chances to perform optimization. As expected, for WBS,
our optimization brings no benefit because there are no
infeasible branches to exploit. On average, SSE with
optimization reduces 30% of the times of constraint solving
and 30% of the search time in the best case, and 25% of the
times of constraint solving and 24.7% of the search time in
the worst case.

The results in column 7 show that, in both pure SE
and SSE, constraint solving dominates the search
time. The percentage of the search time spent on constraint
solving is reduced slightly by SSE.

The last column shows the number of executed instructions
in the different analyses. We can see that, despite executing
plenty of extra instructions, SSE is still faster than pure
SE. Another important point is that SSE consumes almost no
extra memory compared with pure SE. The reason is that
speculative DFS only spends extra memory to store the
states in failed speculation segments, which is negligible
in our experiments.

b. Speculation Depth

Figure 8 shows how the max speculation depth impacts
the times of constraint solving in SSE (without optimization).
Results for larger speculation depths are omitted
since they are nearly the same as the tails of the lines.
For a program P, let $T_P^{p}$ be the times of constraint
solving in pure SE, and let $T_P^{k}$ be the times of constraint
solving in SSE with the max speculation depth k. Figure 8
shows the result of $T_P^{k} / T_P^{p} \times 100\%$. For List, TreeMap,
BinTree and FibHeap, the optimal speculation depth is
2. Particularly, for List, SSE brings benefit only when the
max speculation depth is 2. This is because the high ratio
of infeasible branches causes too much backtracking. For
BinHeap, the optimal speculation depth is 6. For the program
without infeasible branches (WBS), the results decrease
monotonically with the increase of max speculation depth
since the speculations never fail. Figure 9 shows the impact
of the max speculation depth in SSE with optimization, in
which the optimal speculation depths for different programs
are the same as those in Figure 8. Generally, regardless of
the tiny fluctuation in the tail, the optimal speculation depth
ranges from 2 to 6 and shifts from small to big when the
ratio of infeasible branches decreases.

Figure 8. Impact of Max Speculation Depth in SSE

Figure 9. Impact of Max Speculation Depth in SSE With Optimization

Figure 10. Difference of Search Time Between Different Exploration
Orders in SSE

Figure 11. Difference of Search Time Between Different Exploration
Orders in SSE With Optimization

We can see that the optimization technique improves SSE
significantly. Another interesting observation is that the
results become stable when the max speculation depth reaches
a threshold. This also demonstrates that our backtracking
mechanism is quite efficient. Besides, the impacts of the
max speculation depth on the search time are not shown
because they are nearly the same as those on the times of
constraint solving.

c. Exploration Order

Figure 10 illustrates the difference of the search time
under two different exploration orders in SSE without
optimization. For a program P, let $t_P$ be the search time of pure
SE in false-side-first order, $u_P^{k}$ be the search time of SSE with
true-side-first order and $v_P^{k}$ be the search time of SSE with
false-side-first order, where k is the max speculation depth.
Figure 10 shows the result of $(u_P^{k} - v_P^{k}) / t_P \times 100\%$. We can
observe that there exists a distinct advantage of false-side-first
order. Figure 11 shows the same calculation under SSE
with optimization. In this case, the true-side-first order is
slightly better than the false-side-first order, especially when
the speculation depth is set to the optimal value.

To find the reasons, we collect the constraint solving
results of the two sides of the branches under pure SE.
The results are shown in Table II. Columns 2 to 5 show
the constraint solving results of the two sides in the whole
execution tree. Columns 6 to 9 show the constraint solving
results of the branches with equation constraints. We can see
that the true side has a higher probability to be infeasible
in comparison with the false side. The reason is twofold.
Firstly, for branches with equation constraints, the true
sides are more than twice as likely to be infeasible as the
false sides. The equation constraint
makes the path space along the true side narrower. Secondly,
the ratio of the infeasible true sides of the branches with
inequation constraints is also higher. We argue that this
stems from the characteristics of programs. In programming
practice, special cases are usually handled in the then
branch and other cases are put in the else branch. It is
reasonable to think that the then branch is more likely to be
infeasible. This finding implies that the execution trees of
programs are not bilaterally symmetric, but tend to incline to
the false sides of the branches.
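The first observation can be checked directly against the equation-constraint columns of Table II. The sketch below recomputes the infeasibility rate of each side and their quotient for the three programs that contain such branches. The names EQUATION_BRANCHES and infeasible_rate are illustrative.

```python
# Branches with equation constraints, copied from Table II:
# program -> (#feasible true, #infeasible true, #feasible false, #infeasible false)
EQUATION_BRANCHES = {
    "List": (24660, 35328, 46560, 13428),
    "TreeMap": (5601, 7048, 9484, 3165),
    "BinTree": (3839, 6095, 7793, 2141),
}


def infeasible_rate(feasible: int, infeasible: int) -> float:
    """Share of infeasible sides among all sides of one kind."""
    return infeasible / (feasible + infeasible)


true_rate = {}
false_rate = {}
quotient = {}
for name, (ft, it, ff, inf_f) in EQUATION_BRANCHES.items():
    true_rate[name] = infeasible_rate(ft, it)
    false_rate[name] = infeasible_rate(ff, inf_f)
    # How many times more likely the true side is to be infeasible.
    quotient[name] = true_rate[name] / false_rate[name]
    print(f"{name:8s} true {true_rate[name]:.1%}  "
          f"false {false_rate[name]:.1%}  quotient {quotient[name]:.2f}")
```

For all three programs the quotient is above 2, consistent with the statement that true sides of equation branches are more than twice as likely to be infeasible.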

As a result, the higher rate of feasible false side branches
endows SSE in false-side-first order with a higher success
rate of speculation, while the lower rate of feasible true side
branches provides more information to the optimization. In
summary, the asymmetry of the path space of programs
makes SSE with optimization in the true-side-first order the
best choice in our experiments, since the shape of the
execution tree can be leveraged to the utmost extent.



Table II

#Constraint Solving Results of Two Sides

| Program | #feasible true sides | #infeasible true sides (%) | #feasible false sides | #infeasible false sides (%) | #feasible true sides of equation | #infeasible true sides of equation (%) | #feasible false sides of equation | #infeasible false sides of equation (%) |
|---|---|---|---|---|---|---|---|---|
| WBS | 13823 | 0 (0%) | 13823 | 0 (0%) | 3005 | 0 (0%) | 3005 | 0 (0%) |
| List | 30276 | 80952 (73%) | 97800 | 13428 (12%) | 24660 | 35328 (58.9%) | 46560 | 13428 (22.4%) |
| TreeMap | 11567 | 10566 (48%) | 15438 | 6695 (30%) | 5601 | 7048 (55.7%) | 9484 | 3165 (25%) |
| BinTree | 9214 | 9771 (52%) | 13167 | 5818 (31%) | 3839 | 6095 (61.4%) | 7793 | 2141 (21.6%) |
| BinHeap | 70270 | 23576 (25%) | 93846 | 0 (0%) | 0 | 0 (-) | 0 | 0 (-) |
| FibHeap | 25006 | 8572 (26%) | 33008 | 570 (2%) | 0 | 0 (-) | 0 | 0 (-) |


C. Threats to Validity 

The main validity concerns in our experiments are threats
to external validity, which involve two aspects: the chosen
programs and the implementation platform.

We chose 6 programs in our experiments, 5 of which
are often used in the experiments related to JPF. The
characteristics of the path spaces of these programs
definitely influence the results, and our selected programs
may not be representative. However, the ratios of the
infeasible branches in our chosen programs range from 0% to
42%. From this perspective, our subjects are quite
representative. We limit the call sequence length for data
structure programs to control the running time of the
experiments. Longer bounds would make constraint solving more
time-consuming and may affect the results. The conditions of
some branches in data structure programs are heap constraints.
We do not perform speculation for heap constraints because the
subsequent instructions heavily depend on the condition. We
believe that for other types of programs, such as numerical
programs or control programs (WBS in our experiments), the
results may be better.

We selected SPF as the implementation platform and 
Yices as the constraint solver. A different selection may 
yield different running time, whereas the times of constraint 
solving would not change. 

V. Related Work 

Our work is inspired by the speculative execution used
in pipelined processors [21], which predicts the outcome
of a branch and issues the subsequent instructions before
the actual branch outcome is known. SSE also executes the
instructions after a branch before the feasibility of the branch
is known. That is why we use the term speculative symbolic
execution in this paper.

Speculation is used to improve performance in many other
systems, such as operating systems [22] and distributed file
systems [23]. An essential difference between these systems
and the work proposed in this paper is that the performance
improvement brought by our method stems from the decrease
in the number of executions of a special operation
(constraint solving), rather than from the better
parallelization that speculation brings in the systems
mentioned above. To the best of our knowledge,
we are the first to conduct systematic research on using
speculation in symbolic execution.

Our work is also related to the large body of work on the
scalability problem in symbolic execution, which stems from
two causes: the path explosion problem and the constraint
solving overhead. To attack the path explosion problem,
researchers have proposed to use path pruning [24] [25] [26],
compositional methods [27], abstraction [28], state merging [29],
parallelism [11] [18] and so on to improve path exploration.
To alleviate the constraint solving overhead, plenty of work
has been proposed [5][6][7][8][9][10][19][30]. Generally,
the optimization techniques employed in current symbolic
execution systems attack the constraint solving overhead
by query simplification, reusing previous results or fast
checking before constraint solving [6] [5] [7] [8].

The work proposed in this paper is an orthogonal and 
complementary approach. SSE employs a new fashion of 
path exploration technique, aiming to attack the constraint 
solving overhead by reducing the invocation times of con- 
straint solver. SSE neither reduces the complexity of the 
queries submitted to the solver, nor caches constraints to 
reuse previous constraint solving results. SSE reduces the 
constraint solving overhead from a unique perspective. 

Lei Bu et al. use the idea of speculation in [31] for the
reachability checking of linear hybrid automata. Different
from their work, the speculation in SSE is limited by a
specific number, whereas the speculation in [31] stops when
a target location is reached, which is more like a target
driven 'slicing'. This is caused by the different contexts of
using speculation. Another difference is the backtracking
mechanism. We use binary search to find backtracking
points, whereas the irreducible infeasible set technique [32]
is employed in [31] for backtracking. For SSE, binary
search is stable and effective. Nevertheless, the minimal
unsatisfiable core extraction technique [33] may also be used
in the backtracking of SSE to reduce the times of constraint
solving further.

At the time of this writing, EPFL released S2E v1.2
[34]. An optimization named speculative forking is used
in the concolic execution, where symbolic states are forked
without regard to the feasibility at the branch that depends
on symbolic values. These speculatively generated states
are used as backtracking points (if feasible) to avoid
re-execution from scratch when new inputs are generated.
Although both works use a similar term, speculation is used
to achieve different goals.

The philosophy of SSE is somewhat similar to that used in
many static analysis systems, which consider extra program
behaviors and eliminate false alarms in the end [35] [36].
The difference is that SSE prunes the infeasible behaviors
at an appropriate time to keep results precise and to make a
good tradeoff between the cost and the benefit as well.

VI. Conclusion and Future Work 

We have proposed a new fashion of symbolic execution
named speculative symbolic execution to reduce the
invocation times of constraint solver, and hence extend the
scalability of symbolic execution. SSE attacks the constraint
solving overhead, which almost always dominates the
running time of symbolic execution. We have
proposed the speculative DFS algorithm and discussed its
effectiveness. We have also proposed a key optimization
technique, named absurdity based optimization, to further
improve SSE. This optimization is very effective especially
for the programs with a high ratio of infeasible branches.

We have implemented SSE and our optimization technique
on top of SPF. Experiments have been conducted
to investigate several important research questions. The
experimental results on six programs show that SSE can
reduce the invocation times of constraint solver by 21% to
49% (with an average of 30%), and save the search time
from 23.6% to 43.6% (with an average of 30%). For future
work, we plan to research different search styles and
use existing query optimization techniques to enhance SSE
further.

VII. Appendix A

A. Proof of Proposition 2 

We define the height of an execution tree as the number
of branches in the longest path in the tree. Let $Tr_n$ be an
arbitrary execution tree with height n. Let $T^{p}(Tr_n)$ and
$T^{k}(Tr_n)$ be the times of constraint solving in performing
pure SE and speculative DFS (SP-DFS) on $Tr_n$ respectively,
where k is the max speculation depth. Proposition 2 claims
that for any $Tr_n$, $T^{k}(Tr_n) > \frac{1}{2} \times T^{p}(Tr_n)$. We use
induction on the height of the execution tree to prove this
proposition.

Basis: For n = 1, as shown in Figure 12, there are three
possible shapes for $Tr_1$. In each case, both SP-DFS and
pure SE need two times of constraint solving. So Proposition
2 holds for n = 1.

Induction step: Suppose that Proposition 2 holds for
$Tr_n$, i.e., $T^{k}(Tr_n) > \frac{1}{2} \times T^{p}(Tr_n)$. We now show
$T^{k}(Tr_{n+1}) > \frac{1}{2} \times T^{p}(Tr_{n+1})$.

As shown in Figure 13, $Tr_{n+1}$ can be regarded as
constructed by adding a level of branches to some leaves
(at least one) of $Tr_n$. Let e be a leaf of $Tr_n$. As shown in
Figure 13, e can be extended in three different cases. Let
$b_l$ and $b_r$ (at least one is feasible) be the two new branches
under e. In pure SE, each of $b_l$ and $b_r$ needs one time of
constraint solving, no matter whether the branch is feasible
or not. Now we analyze what difference these new branches bring
to the times of constraint solving between the exploration of
$Tr_n$ and $Tr_{n+1}$ by SP-DFS.

Case 1: $b_l$ and $b_r$ are both feasible. In SP-DFS, if
point A (in Figure 13) reaches the max speculation depth
when exploring $Tr_n$, then in the exploration of $Tr_{n+1}$, a new
speculation segment starts at $b_l$, so $b_l$ consumes one time
of constraint solving. If point A does not reach the max
speculation depth, then $b_l$ will be included in the same
speculation segment with the branch above point A. Since
$b_l$ is satisfiable, it brings no extra constraint solving. What's
more, to complete $b_r$, SP-DFS backtracks to point A and
spends one time of constraint solving. In summary, the two
new branches bring 1 or 2 extra times of constraint solving
in SP-DFS.

Figure 12. Execution Tree With Height 1

Figure 13. Execution Tree With Height n + 1

Case 2: $b_l$ is feasible and $b_r$ is infeasible. The argument
is similar to case 1, except that $b_r$ is unsatisfiable.

Case 3: $b_l$ is infeasible and $b_r$ is feasible. If point A
reaches the max speculation depth when exploring $Tr_n$, then
when exploring $Tr_{n+1}$, one time of constraint solving is
needed for $b_l$. Otherwise, when exploring $Tr_{n+1}$, $b_l$ will be
included in the same speculation segment with the branch
above point A. This speculation fails because $b_l$ is infeasible,
and needs $\lceil \log_2 k \rceil$ or $\lceil \log_2 k \rceil + 1$ times of constraint solving
in backtracking. For $b_r$, one time of solving is needed. In
summary, $b_l$ and $b_r$ bring $\lceil \log_2 k \rceil + 1$ or $\lceil \log_2 k \rceil + 2$ times
of constraint solving.
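The logarithmic backtracking cost in Case 3 comes from locating the failing branch by binary search. The following is a minimal sketch of that idea, not the SPF implementation: it assumes a speculation segment is a list of branch constraints, that the path condition before the segment is satisfiable, and that the whole segment has already been found unsatisfiable. The brute-force prefix_is_sat function is a hypothetical stand-in for a real constraint solver, and all names are illustrative.

```python
import math
from typing import Callable, List, Tuple

Constraint = Callable[[int], bool]


def prefix_is_sat(constraints: List[Constraint], length: int) -> bool:
    """Toy solver: try every value of one integer variable in a small domain."""
    return any(all(c(x) for c in constraints[:length]) for x in range(-50, 51))


def find_first_infeasible(constraints: List[Constraint]) -> Tuple[int, int]:
    """Return (1-based index of the first infeasible branch, solver calls used).

    Invariant: the prefix of length lo is sat and the prefix of length hi is
    unsat. Adding constraints can never turn an unsat prefix into a sat one,
    which is what makes binary search applicable.
    """
    lo, hi = 0, len(constraints)
    calls = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        calls += 1
        if prefix_is_sat(constraints, mid):
            lo = mid
        else:
            hi = mid
    return hi, calls


# A speculation segment of k = 6 branches; the fifth contradicts the fourth.
segment: List[Constraint] = [
    lambda x: x > 0,
    lambda x: x < 10,
    lambda x: x != 5,
    lambda x: x > 3,
    lambda x: x < 2,
    lambda x: x > 1,
]

index, calls = find_first_infeasible(segment)
bound = math.ceil(math.log2(len(segment)))
print(f"first infeasible branch: {index}, solver calls: {calls}, bound: {bound}")
```

In this sketch the search backtracks to the state just before the returned branch, and the number of solver calls never exceeds the ceiling of log2(k) for a segment of k branches.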

In summary, for each of the three cases above, $b_r$ always
needs one time of constraint solving, and $b_l$ brings 0 or more
times. Since $b_l$ and $b_r$ need 2 times of constraint solving in
pure SE, the increased times of constraint solving
in SP-DFS is at least half of that in pure SE. According
to the induction hypothesis, $T^{k}(Tr_n) > \frac{1}{2} \times T^{p}(Tr_n)$, we
can get $T^{k}(Tr_{n+1}) > \frac{1}{2} \times T^{p}(Tr_{n+1})$.
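The last step can be written out as a short chain of inequalities. The notation here is introduced only for this sketch: m is the number of leaves of the tree of height n that are extended, and $\Delta_i \ge 1$ is the number of extra solver calls that SP-DFS spends on the i-th extended leaf. It also assumes, as the case analysis implicitly does, that the contributions of different leaves simply add up.

```latex
\begin{align*}
T^{k}(Tr_{n+1}) &= T^{k}(Tr_n) + \sum_{i=1}^{m} \Delta_i
    && \text{each extended leaf adds } \Delta_i \ge 1 \\
  &\ge T^{k}(Tr_n) + m \\
  &> \tfrac{1}{2}\, T^{p}(Tr_n) + m
    && \text{induction hypothesis} \\
  &= \tfrac{1}{2}\bigl(T^{p}(Tr_n) + 2m\bigr) \\
  &= \tfrac{1}{2}\, T^{p}(Tr_{n+1})
    && \text{pure SE adds exactly 2 calls per extended leaf}
\end{align*}
```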

Therefore, Proposition 2 holds for an arbitrary execution
tree.

B. Proof of Equation (1) 

Let $Tree_n$ be a full binary execution tree of height n.
Let $T^{k}(Tree_n)$ be the times of constraint solving in SP-DFS,
where k is the max speculation depth. There are the
following two cases:

Case 1: $n \le k$.

When $n \le k$, speculation segments always terminate because
of path ending. As a result, constraint solving always
occurs at the end of a path, which makes the invocation
times of constraint solver equal to the number of paths. So
we get

$$T^{k}(Tree_n) = 2^{n} \quad (n \le k) \tag{A.2}$$

Case 2: $n > k$.

Now we calculate the relation between $T^{k}(Tree_n)$ and
$T^{k}(Tree_{n+1})$.

As shown in Figure 14, $Tree_{n+1}$ is composed of two
symmetrical subtrees, say $Tree^{l}_{n+1}$ and $Tree^{r}_{n+1}$ respectively,
each of which is constructed by adding a branch on
top of a $Tree_n$.

According to the SP-DFS procedure, it is easy to know
$T^{k}(Tree^{l}_{n+1}) = T^{k}(Tree^{r}_{n+1})$ and
$T^{k}(Tree_{n+1}) = 2 \times T^{k}(Tree^{l}_{n+1})$. Therefore, to calculate
the relation between $T^{k}(Tree_n)$ and $T^{k}(Tree_{n+1})$, it suffices
to know the relation between $T^{k}(Tree_n)$ and $T^{k}(Tree^{l}_{n+1})$.

As shown in Figure 15, $Tree^{l}_{n+1}$ consists of a $Tree_n$
and a branch on the top. Now we focus on what difference
this new branch brings to the times of constraint solving in
exploring $Tree^{l}_{n+1}$ and $Tree_n$ by SP-DFS.

Figure 14. A Full Binary Tree With Height of n + 1

Figure 15. Impact of Adding a Branch on Top of $Tree_n$

For $Tree_n$, SP-DFS starts at point A (in Figure 15). The
first completed path in $Tree_n$ is the leftmost one, say $P_l$,
which is marked with thick arrows in Figure 15. It is easy
to know that when exploring $P_l$, a total of $\lceil n/k \rceil$ times
of constraint solving are needed. Besides, the lengths of
these speculation segments are all k except the last one,
which includes n%k branches. After completing $P_l$, the search
procedure backtracks to point C and starts to traverse the
other part of $Tree_n$.

For $Tree^{l}_{n+1}$, SP-DFS starts at point B. The leftmost path
is still the first to complete, needing a total of $\lceil (n+1)/k \rceil$
times of constraint solving. After completing the leftmost
path, SP-DFS backtracks to the point C and continues. It
is clear that after point C, SP-DFS performs identically to
the exploration of $Tree_n$, because the backtracking points
separate the leftmost path from the other parts of the
tree. Therefore, the only difference in exploring $Tree^{l}_{n+1}$
happens on the leftmost path.

Since

$$\lceil (n+1)/k \rceil =
\begin{cases}
\lceil n/k \rceil & (n \% k \ne 0) \\
\lceil n/k \rceil + 1 & (n \% k = 0)
\end{cases} \tag{A.3}$$

the relation between $T^{k}(Tree^{l}_{n+1})$ and $T^{k}(Tree_n)$ can be
quantified by the following equation:

$$T^{k}(Tree^{l}_{n+1}) =
\begin{cases}
T^{k}(Tree_n) & (n \% k \ne 0) \\
T^{k}(Tree_n) + 1 & (n \% k = 0)
\end{cases} \tag{A.4}$$

From Equation A.4, it is not hard to get the following
equation, which quantifies the times of constraint solving in
performing SP-DFS on $Tree_n$.

$$T^{k}(Tree_n) =
\begin{cases}
2^{n} & (n \le k) \\
2^{n} + \dfrac{2^{n} - 2^{(n \% k)}}{2^{k} - 1} & (n > k)
\end{cases} \tag{A.5}$$
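The counting argument for the leftmost path, and identity (A.3) in particular, can be checked mechanically. The sketch below assumes only what the text states: a path of n feasible branches is cut into speculation segments of at most k branches, and each segment costs one solver call. The function name segment_lengths is illustrative.

```python
import math
from typing import List


def segment_lengths(n: int, k: int) -> List[int]:
    """Split a path of n branches into speculation segments of at most k branches."""
    full, rest = divmod(n, k)
    return [k] * full + ([rest] if rest else [])


# Check the segment count and identity (A.3) over a range of n and k.
for k in range(1, 8):
    for n in range(1, 50):
        count = len(segment_lengths(n, k))
        # One solver call per segment: ceil(n / k) calls on the leftmost path.
        assert count == math.ceil(n / k)
        # (A.3): one more branch adds a segment only when n is a multiple of k.
        extra = 1 if n % k == 0 else 0
        assert len(segment_lengths(n + 1, k)) == count + extra

print(segment_lengths(7, 3))
```

For n = 7 and k = 3 the path is cut into segments of 3, 3 and 1 branches, so the leftmost path costs three solver calls.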



